Abstract



Quantum algorithms have achieved enormous successes during the last
decade. In this paper, we combine the quantum random walk (QRW) with the
problem of data clustering, and develop two clustering algorithms based on
the one dimensional QRW. Then, the probability distributions on the
positions induced by the QRW in these algorithms are investigated, which
also indicates the possibility of obtaining better results. The
experimental results demonstrate that data points in datasets are
clustered reasonably and efficiently, and that the clustering algorithms
converge quickly. Moreover, the comparison with other algorithms also
provides an indication of the effectiveness of the proposed approach.
1 Introduction

Quantum computation is an exciting and rapidly growing field of
investigation that has attracted a lot of interest. More recently, an
increasing number of researchers with different backgrounds, ranging from
physics, computer science and information theory to mathematics and
philosophy, have become involved in researching the properties of
quantum-based computation [1]. During the last decade, a series of
significant breakthroughs have been made. One was that in 1994 Peter Shor
surprised the world by describing a polynomial-time quantum algorithm for
factoring integers [2], a problem for which no efficient classical
algorithm is known. Three years later, in 1997, Lov Grover proved that a
quantum computer could search an unsorted database in only the square root
of the time [3]. Meanwhile, Gilles Brassard et al. combined ideas from
Grover's and Shor's quantum algorithms to propose a quantum counting
algorithm [4].

In recent years, much interest has focused on quantum random walks (QRW),
and considerable work has been done. For instance, some bounds were given
on general graphs [8], where, for the standard deviation and the mixing
time, a quadratic speed-up over the classical counterparts was reported on
the line and cycle. Later, J. Kempe [5] proved that the hitting time from
one corner to the opposite corner of an n-bit hypercube shows an
exponential speed-up, and A. M. Childs et al. [10] used a continuous-time
quantum walk to traverse a special graph exponentially faster than any
classical algorithm.

The successes achieved by quantum algorithms lead us to conjecture that
powerful quantum computers can find solutions faster and better than the
best known classical counterparts, or even solve certain problems that
classical computers cannot solve efficiently. More importantly, they offer
a new way to find potentially dramatic algorithmic speed-ups. Therefore, we
may naturally ask: can we construct quantum versions of classical
algorithms, or present new quantum algorithms, to solve problems in pattern
recognition faster and better on a quantum computer? Following this idea,
some pioneers have proposed novel methods and demonstrated exciting results.

In this paper, we attempt to combine the QRW with the problem of data 
clustering in order to establish a novel QRW based clustering algorithm. QRWs 
differ from classical random walks in that their evolution is unitary and re- 
versible. In the discrete case, an extra "coin" degree of freedom (usually a single 
quantum bit) is introduced into the system. Just like the classical random walk, 
the particle's moves depend on the outcome of a "coin flip". However, in the 
quantum case, both the "coin flip" and the conditional shift of the particle are 
unitary transformations, and different possible classical paths can interfere with 
each other. 

In our algorithms, data points in a dataset are viewed as particles that can 
walk at random in an m-dimensional metric space according to certain rules. 
Further, each data point may be regarded as a local control subsystem, whose 
controller controls its walking behavior. From the point of view of control theory, 
the system with N particles which walk randomly in space may be described
by the block diagram below.



Controller C_1 -- Transition probability matrix P_1 -- Position matrix X_1
Controller C_2 -- Transition probability matrix P_2 -- Position matrix X_2
    ...
Controller C_n -- Transition probability matrix P_n -- Position matrix X_n

Figure 1: The block diagram of the system.



As is shown in Fig. 1, the controlled object is the transition probability
matrix P, and the outputs of the system are the new positions of all
particles in the system. The controller C adjusts the entries in the
transition probability matrix P according to the current positions of the N
particles, and then decides the transition directions and transition
distances of the N particles at the next moment. Finally, the positions of
the N particles are updated synchronously. As data points move in the space
at random, they gather together gradually and form separate clusters
automatically.

The remainder of this paper is organized as follows. Section 2 briefly
introduces some important concepts of quantum computation and the quantum
random walk. Section 3 elaborates the two proposed clustering algorithms
based on the one dimensional QRW. Section 4 first discusses the
relationship between the number of clusters and the number of nearest
neighbors, and then investigates the effect of the number of steps in the
1D-scms and 1D-mcms algorithms. Section 5 briefly introduces the datasets
used in the experiments, and then compares the experimental results of the
proposed algorithms with those of other clustering algorithms. The
conclusion is given in Section 6.

2 Quantum computation and Quantum random 
walk 

2.1 Quantum computation [1], [14]

The elementary unit of quantum computation is called the qubit, which is
typically a microscopic system, such as an atom, a nuclear spin, or a
polarized photon. In quantum computation, the Boolean states 0 and 1 are
represented by a prescribed pair of normalized and mutually orthogonal
quantum states labeled as $\{|0\rangle, |1\rangle\}$ to form a
'computational basis'. Any pure state of the qubit can be written as a
superposition state $\alpha|0\rangle + \beta|1\rangle$ for some $\alpha$
and $\beta$ satisfying $|\alpha|^2 + |\beta|^2 = 1$. A collection of n
qubits is called a quantum register of size n, which spans a Hilbert space
of $2^n$ dimensions, and so $2^n$ mutually orthogonal quantum states are
available.
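
The normalization condition and the $2^n$-dimensional register space can be
checked numerically. The following is only a minimal sketch in plain Python;
the amplitudes and the helper names `is_normalized` and `tensor` are
illustrative assumptions, not part of the paper.

```python
import math
from itertools import product

def is_normalized(amplitudes, tol=1e-12):
    """Check that the squared magnitudes of the amplitudes sum to 1."""
    return abs(sum(abs(a) ** 2 for a in amplitudes) - 1.0) < tol

def tensor(state_a, state_b):
    """Kronecker (tensor) product of two state vectors given as lists."""
    return [a * b for a in state_a for b in state_b]

# A single qubit alpha|0> + beta|1> with illustrative (assumed) amplitudes.
alpha, beta = math.sqrt(0.3), math.sqrt(0.7) * 1j
qubit = [alpha, beta]

# A register of n qubits lives in a space of 2**n dimensions.
n = 3
register = [1.0]
for _ in range(n):
    register = tensor(register, qubit)

# Labels of the 2**n mutually orthogonal computational basis states.
basis_labels = ["".join(bits) for bits in product("01", repeat=n)]
```

Here `register` is a product state of three identical qubits, so it stays
normalized while its length grows to $2^3 = 8$.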

Quantum state preparations, and any other manipulations on qubits, have
to be performed by unitary operations. A quantum logic gate is a device
which performs a fixed unitary operation on selected qubits in a fixed
period of time, and a quantum circuit is a device consisting of quantum
logic gates whose computational steps are synchronized in time. The most
common quantum gate is the Hadamard gate, which acts on a qubit in state
$|0\rangle$ or $|1\rangle$ to produce

$$H|0\rangle = \frac{1}{\sqrt{2}}\,(|0\rangle + |1\rangle), \qquad
H|1\rangle = \frac{1}{\sqrt{2}}\,(|0\rangle - |1\rangle). \qquad (1)$$



2.2 Quantum random walk [7], [15]

In this subsection, we focus on the discrete model of the quantum random
walk in one dimension, which is formally defined as follows. First, let
$H_P$ be the Hilbert space spanned by basis states $\{|i\rangle\}$
representing the position of the particle, and $H_C$ be the two-dimensional
coin space spanned by two basis states
$\{|\uparrow\rangle, |\downarrow\rangle\}$. So the total Hilbert space is
given by $H = H_C \otimes H_P$. Just like the classical random walk, the
evolution of a quantum random walk is divided into two successive
operations: the coin operation and the conditional shift operation.

The coin operation C rotates a state, $|\uparrow\rangle$ or
$|\downarrow\rangle$, in $H_C$ to render the coin state in a superposition,
which is analogous to the coin flip in the classical random walk. One can
design different unitary transformations C to observe different behaviors
of the walk, while the Hadamard coin, a balanced coin that gives equal
chances to move left and right, is commonly used.

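The conditional shift operation S makes the walker take a step to the
right or to the left according to the accompanying coin state, and has the
form

$$S = |\uparrow\rangle\langle\uparrow| \otimes \sum_i |i+1\rangle\langle i|
    + |\downarrow\rangle\langle\downarrow| \otimes \sum_i |i-1\rangle\langle i|. \qquad (2)$$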

Therefore, the evolution of the system at each step of the walk can be
described by the total unitary transformation $U = S \cdot (C \otimes I)$,
where I is the identity operator on $H_P$. If the transformation U is
applied to an initial state $T > 2$ times, a different probability
distribution on the positions is obtained when the state is measured. For
example, assume the initial state is
$|\psi_0\rangle = |\uparrow\rangle \otimes |0\rangle$; after T = 100 steps,
the probability distribution on the positions of the particle is shown in
Fig. 2, where the other curve represents the probability distribution of a
classical random walk.
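
The walk described above can be simulated classically by tracking the
amplitude of every (coin, position) pair. The sketch below assumes the
Hadamard coin and the convention that the up state moves right; the function
names are illustrative and not taken from the paper.

```python
import math
from collections import defaultdict

def hadamard_walk(steps):
    """Position distribution of the 1D Hadamard walk started in |up> x |0>.

    Amplitudes are stored as {(coin, position): amplitude} with coin 0 = up
    (moves right) and coin 1 = down (moves left).
    """
    s = 1.0 / math.sqrt(2.0)
    state = {(0, 0): 1.0}
    for _ in range(steps):
        new_state = defaultdict(float)
        for (coin, pos), amp in state.items():
            # Coin operation H: |up>   -> (|up> + |down>) / sqrt(2),
            #                   |down> -> (|up> - |down>) / sqrt(2).
            up_amp = s * amp
            down_amp = s * amp if coin == 0 else -s * amp
            # Conditional shift S: up moves to pos + 1, down to pos - 1.
            new_state[(0, pos + 1)] += up_amp
            new_state[(1, pos - 1)] += down_amp
        state = new_state
    probs = defaultdict(float)
    for (_, pos), amp in state.items():
        probs[pos] += amp * amp  # amplitudes stay real for this coin
    return dict(probs)

def std_dev(probs):
    """Standard deviation of a distribution given as {position: probability}."""
    mean = sum(x * p for x, p in probs.items())
    return math.sqrt(sum((x - mean) ** 2 * p for x, p in probs.items()))

T = 100
quantum = hadamard_walk(T)
quantum_std = std_dev(quantum)
classical_std = math.sqrt(T)  # unbiased classical walk with unit steps
```

For the first two steps the quantum and classical distributions coincide;
interference only changes the picture from the third step on, which is
consistent with the condition $T > 2$ above.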







Figure 2: The probability distribution for a classical random walk and a quan- 
tum random walk. 



3 Algorithms 

The one dimensional QRW has been widely investigated as a standard model
[5, 7]. On this basis, for a problem of data clustering using the QRW in a
high dimensional space, a natural idea is to divide the m-dimensional QRW
into m one-dimensional QRWs. In addition, assume an unlabeled dataset
$X = \{X_1, X_2, \ldots, X_n\}$, each instance of which has m features. In
the clustering algorithms based on the QRW, each data point in the dataset
is regarded as a movable particle which can walk in the whole space
according to transition probabilities. In this section, two clustering
algorithms based on the one dimensional QRW are constructed, which are
named respectively: (a) 1D-single-coin-multi-step, (b)
1D-multi-coin-multi-step.

3.1 1D-single-coin-multi-step (1D-scms) algorithm

In all clustering algorithms throughout this paper, a decentralized
control strategy is employed, i.e., each data point in the dataset only
interacts with the neighbors in its neighborhood, which are selected by a
k-nearest-neighbor method. Next, the transition probabilities
$P_t(i,j), j \in \Gamma_t(i)$, are computed as below,

$$P_t(i,j) = \begin{cases} \cdots & \text{if } j \in \Gamma_t(i) \\
\cdots & \text{otherwise} \end{cases} \qquad (3)$$

where $\Gamma_t(i)$ represents the neighbor set of a data point $X_i^t$;
$Deg_t(\cdot)$ and $Deg_0(\cdot)$ denote the current and initial degrees of
a data point respectively; likewise, $d(X_i^t, X_j^t)$ and
$d(X_i^0, X_j^0)$ denote the current and initial distances between data
points respectively. Then the largest transition probability $P_t(i,h)$ and
the neighbor $X_h^t$ with this probability, $h \in \Gamma_t(i), h \neq i$,
are identified.
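
The neighbor-selection step can be sketched as follows. This is only an
illustration of building the neighbor sets $\Gamma_t(i)$ with a
k-nearest-neighbor rule under the Euclidean distance; the toy points and the
function names are assumptions, and the transition probabilities of Eq. (3)
are not computed here.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two points given as tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_neighbor_sets(points, k, dist=euclidean):
    """Return, for each point index i, the indices of its k nearest neighbors."""
    neighbor_sets = []
    for i, p in enumerate(points):
        others = [(dist(p, q), j) for j, q in enumerate(points) if j != i]
        others.sort()  # by distance, ties broken by index
        neighbor_sets.append([j for _, j in others[:k]])
    return neighbor_sets

# Toy data (assumed): two well-separated groups in the plane.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0)]
gamma = knn_neighbor_sets(points, k=2)
```

Because the points move during clustering, a routine of this kind would have
to be called again at every iteration.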

As mentioned above, the quantum random walk in an m-dimensional space is
divided into m one-dimensional QRWs. So, in each of the m one-dimensional
walks, a data point $X_i^t$ can only move a step of length $l_L$ or $l_R$,
either left or right. Therefore, the maximal transition probability
$P_t(i,h)$ needs to be mapped into an interval,
$p = f(P_t(i,h)) \in [0.5, 1]$, while the probability of moving in the
opposite direction is $1 - p$. If $p = 0.5$, the Hadamard transformation H
may be applied as a balanced unitary coin in the one dimensional quantum
random walk. In the general case, however, $p \neq 1 - p$, and the balanced
unitary coin is replaced by a biased coin, the transformation C given
below.

$$C = \begin{pmatrix} \sqrt{p} & \sqrt{1-p} \\ \sqrt{1-p} & -\sqrt{p} \end{pmatrix},
\qquad p = f(P_t(i,h)) = f(\max_j P_t(i,j)). \qquad (4)$$

It can be easily verified that the transformation C is unitary, which
satisfies the requirement of reversibility in quantum mechanics.
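
The unitarity of the biased coin can also be checked numerically. The sketch
below assumes the real, symmetric matrix form of C that is consistent with
the two-step state in Eq. (6); `biased_coin` and `is_unitary` are
illustrative names.

```python
import math

def biased_coin(p):
    """2x2 coin matrix built from a probability p in [0.5, 1] (assumed form)."""
    a, b = math.sqrt(p), math.sqrt(1.0 - p)
    return [[a, b],
            [b, -a]]

def is_unitary(m, tol=1e-12):
    """Check M * M^T = I for a real 2x2 matrix (transpose equals adjoint here)."""
    for i in range(2):
        for j in range(2):
            entry = sum(m[i][k] * m[j][k] for k in range(2))
            expected = 1.0 if i == j else 0.0
            if abs(entry - expected) > tol:
                return False
    return True

C = biased_coin(0.8)
H = biased_coin(0.5)  # p = 0.5 recovers the Hadamard coin
```

The rows of the matrix are orthonormal for every p in [0, 1], so the check
does not depend on the particular value chosen for p.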

As is known, each term in a superposition state may be viewed as a position
at which the particle is located, and its amplitude determines the
probability that the particle is found at that place. If the
transformation U is applied to the initial state twice or more, then more
terms are yielded in the superposition state $|\psi\rangle$, which provides
more positions at which the particle may appear. Furthermore, these
positions enlarge the search area in the solution space and supply
opportunities to obtain better results. Thanks to quantum parallelism [1],
only one unitary operation is needed to compute all positions and the
corresponding probabilities, which is inconceivable in the classical world.

Further, in the 1D-scms algorithm the total transformation
$U = S \cdot (C \otimes I)$ is applied to the initial state $\tau$ times in
succession, so the step lengths for moving left and right, $l_L$ and $l_R$,
are reduced by a factor of $\tau$, namely to $l_L/\tau$ and $l_R/\tau$.
Thus, in a form similar to Eq. (2), the conditional shift transformation S
can be expressed as the following unitary operator.

$$S = |\uparrow\rangle\langle\uparrow| \otimes \sum_b |b + l_R/\tau\rangle\langle b|
    + |\downarrow\rangle\langle\downarrow| \otimes \sum_b |b - l_L/\tau\rangle\langle b| \qquad (5)$$

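If the initial state of a particle is
$|\psi_0\rangle = |\uparrow\rangle \otimes |0\rangle$, after the total
transformation is applied $\tau = 2$ times, the obtained superposition
state $|\psi\rangle$ is given as follows

$$\begin{aligned}
|\psi_0\rangle &\xrightarrow{U} \sqrt{p}\,|\uparrow\rangle \otimes |l_R/\tau\rangle
   + \sqrt{1-p}\,|\downarrow\rangle \otimes |-l_L/\tau\rangle \\
 &\xrightarrow{U} p\,|\uparrow\rangle \otimes |l_R\rangle
   + \sqrt{p(1-p)}\,|\downarrow\rangle \otimes |(l_R - l_L)/\tau\rangle \\
 &\qquad + (1-p)\,|\uparrow\rangle \otimes |(l_R - l_L)/\tau\rangle
   - \sqrt{p(1-p)}\,|\downarrow\rangle \otimes |-l_L\rangle = |\psi\rangle.
\end{aligned} \qquad (6)$$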

From Eq. (6), we can see that the particle is not only on the positions
$l_R$ and $-l_L$ with probabilities $p^2$ and $p(1-p)$, but also at the
same time appears on a new position $(l_R - l_L)/\tau$ with probability
$p(1-p) + (1-p)^2 = 1 - p$. At this time, if the superposition state is
measured, it will collapse to one of the three positions with the
corresponding probability. The corresponding component of the
m-dimensional vector $X_i^t$ is then updated according to the following
formulation,

$$X_i^{t+1}(j) = \begin{cases}
X_i^t(j) + l_R & \text{if on position } l_R \\
X_i^t(j) + (l_R - l_L)/\tau & \text{if on position } (l_R - l_L)/\tau \\
X_i^t(j) - l_L & \text{if on position } -l_L
\end{cases} \qquad (7)$$

$$l_R = p \times (X_h^t(j) - X_i^t(j)), \qquad
l_L = (1 - p) \times (X_h^t(j) - X_i^t(j)), \qquad j \in \{1, 2, \ldots, m\}.$$
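
The two-step example and the update rule can be reproduced with a small
simulation. The sketch below assumes the biased coin with entries
$\sqrt{p}$ and $\sqrt{1-p}$, step lengths $l_R/\tau$ and $l_L/\tau$, and
illustrative coordinate values; `scms_distribution` and `measure` are
hypothetical helper names, and sampling stands in for the quantum
measurement.

```python
import math
import random
from collections import defaultdict

def scms_distribution(p, l_right, l_left, tau):
    """Position distribution after tau steps of the biased-coin walk.

    The walker starts in |up> x |0>; each step moves l_right/tau to the right
    (coin up) or l_left/tau to the left (coin down).
    """
    a, b = math.sqrt(p), math.sqrt(1.0 - p)
    # Key: (coin, number of right moves so far); coin 0 = up, 1 = down.
    state = {(0, 0): 1.0}
    for _ in range(tau):
        new_state = defaultdict(float)
        for (coin, n_right), amp in state.items():
            # Coin: |up> -> a|up> + b|down>, |down> -> b|up> - a|down>.
            up_amp = a * amp if coin == 0 else b * amp
            down_amp = b * amp if coin == 0 else -a * amp
            new_state[(0, n_right + 1)] += up_amp   # shift right
            new_state[(1, n_right)] += down_amp     # shift left
        state = new_state
    probs = defaultdict(float)
    for (_, n_right), amp in state.items():
        position = (n_right * l_right - (tau - n_right) * l_left) / tau
        probs[position] += amp * amp
    return dict(probs)

def measure(probs, rng=random):
    """Sample one position according to the distribution."""
    positions = list(probs)
    return rng.choices(positions, weights=[probs[x] for x in positions])[0]

p, x_i, x_h = 0.8, 1.0, 3.0       # illustrative (assumed) values
l_R = p * (x_h - x_i)             # step toward the neighbor
l_L = (1.0 - p) * (x_h - x_i)     # step away from the neighbor
dist = scms_distribution(p, l_R, l_L, tau=2)
x_new = x_i + measure(dist)       # updated coordinate after measurement
```

With $\tau = 2$ the distribution has three positions whose probabilities are
$p^2$, $1 - p$ and $p(1-p)$, in agreement with the discussion of Eq. (6).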

As data points move in space at random, their positions are constantly
changing and the nearest neighbors of each data point also vary over time.
So the distances among data points and the degree of each data point need
to be recomputed at every iteration. The whole process is repeated until
the sum of the walking step lengths of all data points is less than a
preset threshold $\varepsilon$, at which point the algorithm exits.



3.2 1D-multi-coin-multi-step (1D-mcms) algorithm

In the previous algorithm, only the largest transition probability is used
to establish the transformation C. If every one of the transition
probabilities $P_t(i,j), j \in \Gamma_t(i)$, is employed to construct a
transformation $C_j$, and different step lengths are used, $|\Gamma_t(i)|$
total transformations, $U_j = S_j \cdot (C_j \otimes I),
j = 1, 2, \ldots, |\Gamma_t(i)|$, will be produced, where the symbol
$|\cdot|$ represents the cardinality of a set. When all transformations
$U_j$ are applied in turn, the probability distribution on the positions
will be largely different from that in the 1D-scms algorithm.

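Therefore, after the transition probabilities $P_t(i,j), j \in \Gamma_t(i)$,
are computed, $k = |\Gamma_t(i)|$ coin transformations $C_j$ and
conditional shift transformations $S_j$ are established. So

$$\begin{aligned}
U_j &= S_j \cdot (C_j \otimes I), \qquad
C_j = \begin{pmatrix} \sqrt{p_j} & \sqrt{1-p_j} \\ \sqrt{1-p_j} & -\sqrt{p_j} \end{pmatrix}, \\
S_j &= |\uparrow\rangle\langle\uparrow| \otimes \sum_b |b + l_{R,j}/k\rangle\langle b|
     + |\downarrow\rangle\langle\downarrow| \otimes \sum_b |b - l_{L,j}/k\rangle\langle b|, \\
p_j &= f(P_t(i,j)), \qquad j \in \Gamma_t(i).
\end{aligned} \qquad (8)$$

When all transformations $U_j$ are applied to the initial state
$|\psi_0\rangle$, the obtained superposition state is
$|\psi\rangle = U_k \cdots U_2 (U_1 |\psi_0\rangle)$.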
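If the initial state of a particle is
$|\psi_0\rangle = |\uparrow\rangle \otimes |0\rangle$ and $k = 2$, after
the transformations $U_1, U_2$ are applied, the superposition state takes
the form below

$$\begin{aligned}
|\psi_0\rangle &\xrightarrow{U_1} \sqrt{p_1}\,|\uparrow\rangle \otimes |l_{R,1}/k\rangle
   + \sqrt{1-p_1}\,|\downarrow\rangle \otimes |-l_{L,1}/k\rangle \\
 &\xrightarrow{U_2} \sqrt{p_1 p_2}\,|\uparrow\rangle \otimes |(l_{R,1} + l_{R,2})/k\rangle
   + \sqrt{p_1(1-p_2)}\,|\downarrow\rangle \otimes |(l_{R,1} - l_{L,2})/k\rangle \\
 &\qquad + \sqrt{(1-p_1)(1-p_2)}\,|\uparrow\rangle \otimes |(l_{R,2} - l_{L,1})/k\rangle
   - \sqrt{p_2(1-p_1)}\,|\downarrow\rangle \otimes |-(l_{L,1} + l_{L,2})/k\rangle = |\psi\rangle.
\end{aligned} \qquad (9)$$

From Eq. (9), we can see that the particle appears on four positions with
different probabilities. Similarly, when the superposition state is
measured, it will collapse to one of the four positions with the
corresponding probability. The corresponding component of the
m-dimensional vector $X_i^t$ is updated by the following formulation.

$$X_i^{t+1}(j) = \begin{cases}
X_i^t(j) + (l_{R,1} + l_{R,2})/k & \text{if on position } (l_{R,1} + l_{R,2})/k \\
X_i^t(j) + (l_{R,1} - l_{L,2})/k & \text{if on position } (l_{R,1} - l_{L,2})/k \\
X_i^t(j) + (l_{R,2} - l_{L,1})/k & \text{if on position } (l_{R,2} - l_{L,1})/k \\
X_i^t(j) - (l_{L,1} + l_{L,2})/k & \text{if on position } -(l_{L,1} + l_{L,2})/k
\end{cases} \qquad (10)$$

$$\begin{aligned}
l_{R,1} &= p_1 \times (X_1^t(j) - X_i^t(j)), & l_{L,1} &= (1 - p_1) \times (X_1^t(j) - X_i^t(j)), \\
l_{R,2} &= p_2 \times (X_2^t(j) - X_i^t(j)), & l_{L,2} &= (1 - p_2) \times (X_2^t(j) - X_i^t(j)),
\end{aligned} \qquad j \in \{1, 2, \ldots, m\},$$

where $X_1^t$ and $X_2^t$ here denote the two neighbors associated with
$p_1$ and $p_2$.

The steps of the two algorithms are summarized in Table 1.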
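
The four positions and their probabilities for $k = 2$ can be checked with a
short simulation that applies a different coin and shift at every step. This
is a sketch under the assumptions used in Eq. (9): biased coins built from
$p_1, p_2$, shifts scaled by $1/k$, and illustrative coordinate differences;
`mcms_distribution` is a hypothetical name.

```python
import math
from collections import defaultdict

def mcms_distribution(ps, l_rights, l_lefts):
    """Distribution after applying U_1, ..., U_k once each, in order.

    ps[j], l_rights[j], l_lefts[j] parameterize the j-th coin and shift;
    every shift is scaled by 1/k. The walker starts in |up> x |0>.
    """
    k = len(ps)
    state = {(0, 0.0): 1.0}  # key: (coin, position); coin 0 = up, 1 = down
    for p, l_r, l_l in zip(ps, l_rights, l_lefts):
        a, b = math.sqrt(p), math.sqrt(1.0 - p)
        new_state = defaultdict(float)
        for (coin, pos), amp in state.items():
            up_amp = a * amp if coin == 0 else b * amp
            down_amp = b * amp if coin == 0 else -a * amp
            # Positions are rounded so that paths ending at the same place
            # share one key and their amplitudes can interfere.
            new_state[(0, round(pos + l_r / k, 9))] += up_amp
            new_state[(1, round(pos - l_l / k, 9))] += down_amp
        state = new_state
    probs = defaultdict(float)
    for (_, pos), amp in state.items():
        probs[pos] += amp * amp
    return dict(probs)

p1, p2 = 0.8, 0.6
d1, d2 = 2.0, 1.0   # illustrative coordinate differences to two neighbors
dist = mcms_distribution(
    ps=[p1, p2],
    l_rights=[p1 * d1, p2 * d2],
    l_lefts=[(1 - p1) * d1, (1 - p2) * d2],
)
```

For these values the four probabilities are $p_1 p_2$, $p_1(1-p_2)$,
$(1-p_1)(1-p_2)$ and $(1-p_1)p_2$, which sum to one.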



4 Discussion 



In this section, we first discuss how the number of clusters is affected
by changes in the number k of nearest neighbors. Then, for the 1D-scms and
1D-mcms algorithms, the relationship between the number of steps (the
number of times the total transformation is applied) and the clustering
accuracy is investigated.



4.1 Number of nearest neighbors vs. number of clusters 

The number k of nearest neighbors represents the number of neighbors to
which a data point $X_i \in X$ connects. If the longest distance between
the data point and its k nearest neighbors is selected as a radius, then a
virtual circle centered on the data point can be drawn. This circle may be
viewed as the interaction range of the data point, whose radius grows with
the number of nearest neighbors. For a dataset, the number k of nearest
neighbors determines the number of clusters in part. Generally speaking,
the number of clusters decreases as the number of nearest neighbors
increases. For example, if the number k of nearest neighbors is small, the
interaction range of a data point is small too. Further, considering the
connectivity of a graph, we find that the interaction ranges of data points
intersect each other only slightly, so that the connected domain formed is
also small. As data points move in space, they gradually come closer to one
another, which further decreases both the interaction ranges of the data
points and the connected domain on the graph. Hence, in this case, they
gather together only with the not-too-distant data points around them. As a
consequence, the data points form many small clusters, as is shown in
Fig. 3(a).

Table 1: Steps of the clustering algorithms.

Select a distance function $d(\cdot, \cdot)$
Initialization:
    Set the number of nearest neighbors k and the threshold $\varepsilon$
    Compute the initial distance matrix $D(0)_{n \times n} = [d(X_i^0, X_j^0)]_{i,j=1,2,\ldots,n}$
    and the initial degree vector $Deg(0)_{n \times 1} = [Deg_0(i)]_{i=1,2,\ldots,n}$
Repeat:
    Compute the current distance matrix $D(t)_{n \times n} = [d(X_i^t, X_j^t)]_{i,j=1,2,\ldots,n}$
    Identify the current neighbor set $\Gamma_t(i)$ for each data point
    Compute the current degree vector $Deg(t)_{n \times 1} = [Deg_t(i)]_{i=1,2,\ldots,n}$
    For each data point $X_i^t$
        Compute the transition probabilities $P_t(i,j), j \in \Gamma_t(i)$ according to Eq. (3)
        1D-scms: Establish the coin transformation C and
                 the transformation U using Eq. (4) and Eq. (5)
            Apply the transformation to the initial state $\tau$ times in each dimension
            Measure the state and update each component of $X_i^t$ according to Eq. (7)
        1D-mcms: Establish the transformations $U_j$ using Eq. (8)
            Apply the transformations to the initial state k times in each dimension
            Measure the state and update each component of $X_i^t$ according to Eq. (10)
        Compute the sum of transition distances of each data point $\omega_i = \sum_{j=1}^{m} l_j$
    End For
Until $\sum_{i=1}^{n} \omega_i < \varepsilon$

On the other hand, if the number k of nearest neighbors is large, the
interaction ranges of data points increase as well, which makes them
intersect each other to a large extent. Thus, larger connected domains are
formed on the graph. Even if the interaction ranges of data points decrease
as the data points move and approach each other in space, the connected
domains remain larger than those obtained with a small number k of nearest
neighbors. Finally, several big clusters are formed. Fig. 3 illustrates the
relationship between the number of clusters and the number of nearest
neighbors. As analyzed above, eight clusters are obtained by the clustering
algorithm when k = 8 in Fig. 3(a). As the number k of nearest neighbors
rises, five clusters are obtained when k = 14 in Fig. 3(b), and three
clusters when k = 22 in Fig. 3(c). So, if the exact number of clusters is
not known in advance, different numbers of clusters may be obtained by
adjusting the number k of nearest neighbors in practice.
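
The connectivity argument alone can be illustrated on a toy example: the
number of connected components of a k-nearest-neighbor graph never increases
as k grows. The sketch below is not the clustering algorithm itself; it only
counts components of a symmetric k-NN graph on assumed one-dimensional data,
with `knn_components` as a hypothetical helper.

```python
def knn_components(points, k):
    """Number of connected components of the symmetric k-NN graph (1D points)."""
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        order = sorted((abs(points[i] - points[j]), j) for j in range(n) if j != i)
        for _, j in order[:k]:
            parent[find(i)] = find(j)   # join i with each of its k neighbors
    return len({find(i) for i in range(n)})

# Toy 1D data (assumed): three tight groups separated by gaps.
points = [0.0, 0.1, 0.2, 1.0, 1.1, 1.2, 3.0, 3.1, 3.2]
counts = {k: knn_components(points, k) for k in (1, 2, 3, 5, 8)}
```

Since the neighbor list for k is a prefix of the list for k + 1, every edge
present for a small k is also present for a larger k, so components can only
merge.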

(a) k = 8 (b) k = 14 (c) k = 22

Figure 3: The number of nearest neighbors vs. number of clusters.

4.2 Effect of number of steps in the 1D-scms and 1D-mcms
algorithms

In the 1D-scms algorithm, the coin transformation C is constructed on the
basis of the largest transition probability, but the steps moving left and
right are reduced to $l_L/\tau$ and $l_R/\tau$. If the total transformation
$U = S \cdot (C \otimes I)$ is applied to the initial state
$|\psi_0\rangle$ $\tau$ times, the probability distribution on the
positions of the particle will be different from the distribution obtained
by applying it only once. As the number of applications of the
transformation U grows, the probability of appearing on the position $l_L$
or $l_R$ drops constantly, while the probability of being located at a
position between $l_L$ and $l_R$ rises, and later the position with the
largest probability also moves and approaches $l_L$ or $l_R$. Hence, if
this is carried to the extreme, in the limit case, the distribution of the
1D-scms algorithm approaches the distribution produced by applying the
total transformation U only once, but the largest probability drops. For
instance, in each dimension, assume the initial state of a particle is
$|\psi_0\rangle = |\uparrow\rangle \otimes |0\rangle$; the transition
probabilities of moving right and left are p and 1 - p (p > 1 - p); and the
steps are $l_L/\tau = l_R/\tau = 1$. After the transformation U is applied
to the initial state $\tau = 30, 60, 100$ times respectively, with p = 0.8,
the probability distribution on the positions of the particle is
illustrated in Fig. 4(a).




(a) (b) (c) 



Figure 4: The steps vs. clustering accuracy in the 1D-scms algorithm.

As is shown in Fig. 4(a), for the same initial state, the probability
distributions are different from each other, and the positions with the
maximal probabilities tend to move closer to the position $l_R$ as the
number of steps $\tau$ increases. Fig. 4(b) exhibits the relationship
between the clustering accuracy (for the definition, see Section 5.2) and
the number of nearest neighbors, in which each curve is obtained at a fixed
$\tau$. Further, the mean and variance of the points in each curve are
drawn in Fig. 4(c), which shows that the means do not increase
monotonically with the number of steps, but drop slightly after
$\tau = 6$, since too many steps degrade the algorithm. To avoid this, we
recommend $\tau = 5$ or $\tau = 6$.
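
The behavior of the distribution for growing $\tau$ can be explored with the
settings quoted for Fig. 4(a): p = 0.8, unit steps and the initial state
$|\uparrow\rangle \otimes |0\rangle$. The sketch below assumes the biased
coin with entries $\sqrt{p}$ and $\sqrt{1-p}$ and records, for each $\tau$,
the most probable position and its probability; it does not reproduce the
figure itself.

```python
import math
from collections import defaultdict

def walk_distribution(p, tau):
    """Distribution over integer positions after tau unit steps (start |up> x |0>)."""
    a, b = math.sqrt(p), math.sqrt(1.0 - p)
    state = {(0, 0): 1.0}  # key: (coin, position); coin 0 = up, 1 = down
    for _ in range(tau):
        new_state = defaultdict(float)
        for (coin, pos), amp in state.items():
            up_amp = a * amp if coin == 0 else b * amp
            down_amp = b * amp if coin == 0 else -a * amp
            new_state[(0, pos + 1)] += up_amp    # up component steps right
            new_state[(1, pos - 1)] += down_amp  # down component steps left
        state = new_state
    probs = defaultdict(float)
    for (_, pos), amp in state.items():
        probs[pos] += amp * amp
    return dict(probs)

summary = {}
for tau in (30, 60, 100):
    probs = walk_distribution(0.8, tau)
    mode = max(probs, key=probs.get)
    summary[tau] = (mode, probs[mode])  # most likely position and its probability
```

Dividing each recorded position by $\tau$ gives its location relative to the
maximal displacement, which makes the runs for different $\tau$ comparable.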

The 1D-mcms algorithm differs from the 1D-scms algorithm in that
$k = |\Gamma_t(i)|$ transformations $C_j$ and
$U_j = S_j \cdot (C_j \otimes I), j = 1, 2, \ldots, k$, are constructed and
applied. As such, even if the same initial state
$|\psi_0\rangle = |\uparrow\rangle \otimes |0\rangle$ is employed, the
obtained probability distribution on the positions is also different, which
is shown in Fig. 5(a). Again, the number of total transformations is
associated directly with the number of nearest neighbors. Thus, if the
number of nearest neighbors is large, then more total transformations $U_j$
will be established, and the number of times they are applied increases
accordingly. Similarly, from Fig. 5(c), we can observe that degradation
also occurs in the 1D-mcms algorithm when the number of steps is large.
Therefore, in practice, the k transition probabilities
$P_t(i,j), j \in \Gamma_t(i)$, are first sorted in descending order, and
then the $\tau$ largest transition probabilities are selected to construct
the transformations $U_j = S_j \cdot (C_j \otimes I),
j = 1, 2, \ldots, \tau < k$, which is a method to reduce the degradation of
the algorithm. Likewise, we recommend $\tau = 5$ or $\tau = 6$ in the
1D-mcms algorithm.

(a) (b) (c)

Figure 5: The steps vs. clustering accuracy in the 1D-mcms algorithm.

5 Experiment 

To evaluate the two clustering algorithms, we choose six datasets from the
UCI repository [16], namely the Soybean, Iris, Sonar, Glass, Ionosphere and
Breast Cancer Wisconsin datasets, and carry out the experiments on them. In
this section, we first introduce these datasets briefly, and then present
the experimental results.

5.1 Experimental setup 

The original data points in the above datasets are all scattered in high
dimensional spaces spanned by their features; the description of all
datasets is summarized in Table 2. As for the Breast dataset, missing
feature values are replaced by random numbers. Finally, the algorithms are
coded in Matlab 6.5.



Table 2: Description of datasets.

Dataset      Instances  Features  Classes
Soybean      47         21        4
Iris         150        4         3
Sonar        208        60        2
Glass        214        9         6
Ionosphere   351        32        2
Breast       699        9         2
Throughout all experiments, data points in a dataset are considered as
movable particles whose initial positions are taken from the datasets
directly. Next, the k nearest neighbors of each data point in the dataset
are found after a distance function is selected, which only needs to
satisfy the condition that the more similar two data points are, the
smaller the output of the function is. In the experiments, the Euclidean
distance function (L2-norm distance) is employed. Additionally, in the
1D-scms and 1D-mcms algorithms, the variable $\tau$ is set to $\tau = 6$.



5.2 Experimental results 

The two clustering algorithms constructed above are tested on the six
datasets. As analyzed in Section 4.1, for a dataset the number of clusters
decreases as the number k of nearest neighbors increases. Therefore, when a
small k is selected, it is possible that the number of clusters is larger
than the preset number of clusters in the dataset after the algorithm
terminates. So a merging subroutine is called to merge unwanted clusters,
which works as follows. First, the cluster with the fewest data points is
identified, and then it is merged into the cluster whose centroid is
closest to its own. This subroutine is repeated until the number of
clusters is equal to the preset number. Moreover, the algorithms are run on
every dataset with different numbers of nearest neighbors, and the
clustering results obtained by the two algorithms are compared in Fig. 6,
in which each point represents a clustering accuracy.
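
The merging subroutine described above can be sketched as follows. This is a
minimal illustration with assumed toy data; the function names and the use
of the Euclidean distance between centroids are assumptions made for the
example.

```python
import math

def centroid(points):
    """Component-wise mean of a list of points."""
    m = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(m)]

def merge_small_clusters(points, labels, target):
    """Merge the smallest cluster into the cluster with the nearest centroid
    until only `target` clusters remain. Returns a new label list."""
    labels = list(labels)
    while len(set(labels)) > target:
        clusters = {}
        for p, c in zip(points, labels):
            clusters.setdefault(c, []).append(p)
        # Cluster with the fewest data points.
        smallest = min(clusters, key=lambda c: len(clusters[c]))
        center = centroid(clusters[smallest])
        # Remaining cluster whose centroid is closest to it.
        nearest = min(
            (c for c in clusters if c != smallest),
            key=lambda c: math.dist(center, centroid(clusters[c])),
        )
        labels = [nearest if c == smallest else c for c in labels]
    return labels

# Toy example (assumed): cluster 2 holds a single point near cluster 0.
points = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (0.5, 2.0)]
labels = [0, 0, 0, 1, 1, 2]
merged = merge_small_clusters(points, labels, target=2)
```

Centroids are recomputed after every merge, so a cluster that has absorbed a
smaller one is compared through its updated centroid in the next round.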

Definition 1 $cluster_i$ is the label which is assigned to a data point
$X_i$ in a dataset by the algorithm, and $c_i$ is the actual label of the
data point $X_i$ in the dataset. So the clustering accuracy is [17]:

$$accuracy = \frac{\sum_{i=1}^{n} \lambda(map(cluster_i), c_i)}{n}, \qquad
\lambda(map(cluster_i), c_i) = \begin{cases}
1 & \text{if } map(cluster_i) = c_i \\
0 & \text{otherwise}
\end{cases} \qquad (11)$$

where the mapping function $map(\cdot)$ maps the label obtained by the
algorithm to the actual label.
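
A small sketch of this accuracy measure is given below. The text does not
specify how $map(\cdot)$ is chosen; the example assumes the best one-to-one
assignment of cluster labels to class labels, found by brute force, and the
label values are made up for illustration.

```python
from itertools import permutations

def clustering_accuracy(cluster_labels, true_labels):
    """Accuracy under the best one-to-one map from cluster labels to classes.

    Brute-force search over permutations (an assumption about map(.));
    only practical when the number of classes is small.
    """
    clusters = sorted(set(cluster_labels))
    classes = sorted(set(true_labels))
    if len(clusters) > len(classes):
        raise ValueError("merge clusters first: more clusters than classes")
    n = len(true_labels)
    best = 0.0
    for assigned in permutations(classes, len(clusters)):
        mapping = dict(zip(clusters, assigned))
        hits = sum(1 for c, y in zip(cluster_labels, true_labels) if mapping[c] == y)
        best = max(best, hits / n)
    return best

cluster_labels = [2, 2, 2, 0, 0, 1, 1, 1]
true_labels = ["a", "a", "b", "b", "b", "c", "c", "c"]
acc = clustering_accuracy(cluster_labels, true_labels)
```

In this toy case seven of the eight points agree with their class under the
best assignment, so `acc` equals 0.875.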

As is shown in Fig. 6, similar results are obtained by the two algorithms
at different numbers of nearest neighbors, but almost all the best results
are yielded by the 1D-mcms algorithm. Additionally, we compare our results
with those obtained by other clustering algorithms, Kmeans [18],
PCA-Kmeans [18], and LDA-Km [18], on the same datasets. The comparison is
summarized in Table 3.



Table 3: Comparison of clustering accuracies of the algorithms.

Algorithm     Soybean  Iris    Sonar   Glass   Ionosphere  Breast
1D-scms       91.49%   90%     62.02%  64.49%  71.51%      95.42%
1D-mcms       97.87%   96.67%  62.02%  64.02%  75.21%      95.42%
Kmeans        68.1%    89.3%   -       47.2%   71%         -
PCA-Kmeans    72.3%    88.7%   -       45.3%   71%         -
LDA-Km        76.6%    98%     -       51%     71.2%       -
(a) Soybean dataset (b) Iris dataset (c) Sonar dataset
(d) Glass dataset (e) Ionosphere dataset (f) Breast dataset

Figure 6: Comparison of clustering accuracies of the proposed algorithms.

6 Conclusions 

The enormous successes achieved by quantum algorithms make us realize that
quantum algorithms may obtain solutions faster and better than their
classical counterparts. Therefore, we combine the QRW with the problem of
data clustering, and establish clustering algorithms based on the QRW. In
the algorithms, the data points to be clustered are considered as movable
particles, which may also be seen as local control subsystems from the
point of view of control theory. Further, we develop two clustering
algorithms based on the one dimensional QRW, and discuss the probability
distributions on the positions induced by the QRW under two different
cases: (a) only one transformation C is constructed but the total
transformation U is applied twice or more; (b) several transformations C
and U are constructed and applied. Thanks to quantum parallelism, all
positions and the corresponding probabilities are computed by applying the
unitary operation only once, in contrast to many times in the classical
world. Besides, on the basis of the QRW, probability distributions on the
positions that do not exist in the classical case are produced, which
provides opportunities for obtaining better results.

In these algorithms, when the exact number of clusters is unknown in ad- 
vance, one can adjust the number k of nearest neighbors to control the number of 
clusters which decreases with the increase of the number k of nearest neighbors. 
We evaluate the clustering algorithms on six real datasets, and experimental re- 
sults have demonstrated that data points in a dataset are clustered reasonably 
and efficiently. Additionally, these clustering algorithms can also detect clusters 
of arbitrary shape, size and density. 